Temporal information in newswire articles : an annotation scheme and corpus study.
Many natural language processing applications, such as information extraction, question
answering, topic detection and tracking, would benefit significantly from the ability to
accurately position reported events in time, either relatively with respect to other events or
absolutely with respect to calendrical time. However, relatively little work has been done
to date on the automatic extraction of temporal information from text.
Before we can progress to automatically positioning reported events in time, we must gain
an understanding of the mechanisms used to do this in language. This understanding can
be promoted through the development of an annotation scheme, which allows us to identify
the textual expressions conveying events, times and temporal relations in a corpus of 'real'
text.
This thesis describes a fine-grained annotation scheme with which we can capture all
events, times and temporal relations reported in a text. To aid the application of the scheme
to text, a graphical annotation tool has been developed. This tool not only allows easy markup
of sophisticated temporal annotations, it also contains an interactive, inference-based
component supporting the gathering of temporal relations. The annotation scheme and the
tool have been evaluated through the construction of a trial corpus during a pilot study. In
this study, a group of annotators was supplied with a description of the annotation scheme
and asked to apply it to a trial corpus.
The pilot study showed that the annotation scheme was difficult to apply, but that its application is feasible
with improvements to the definition of the annotation scheme and the tool. Analysis of
the resulting trial corpus also provides preliminary results on the relative extent to which
different linguistic mechanisms, explicit and implicit, are used to convey temporal relational
information in text.
Building a semantically annotated corpus of clinical texts
In this paper, we describe the construction of a semantically annotated corpus of clinical texts for use in the development and evaluation of systems for automatically extracting clinically significant information from the textual component of patient records. The paper details the sampling of textual material from a collection of 20,000 cancer patient records, the development of a semantic annotation scheme, the annotation methodology, the distribution of annotations in the final corpus, and the use of the corpus for development of an adaptive information extraction system. The resulting corpus is the most richly semantically annotated resource for clinical text processing built to date, whose value has been demonstrated through its use in developing an effective information extraction system. The detailed presentation of our corpus construction and annotation methodology will be of value to others seeking to build high-quality semantically annotated corpora in biomedical domains.
Extinctions, genetic erosion and conservation options for the black rhinoceros (Diceros bicornis)
The black rhinoceros is again on the verge of extinction due to unsustainable poaching in its native range. Despite a wide historic distribution, the black rhinoceros was traditionally thought of as depauperate in genetic variation, and with very little known about its evolutionary history. This knowledge gap has hampered conservation efforts because hunting has dramatically reduced the species’ once continuous distribution, leaving five surviving gene pools of unknown genetic affinity. Here we examined the range-wide genetic structure of historic and modern populations using the largest and most geographically representative sample of black rhinoceroses ever assembled. Using both mitochondrial and nuclear datasets, we described a staggering loss of 69% of the species’ mitochondrial genetic variation, including the most ancestral lineages that are now absent from modern populations. Genetically unique populations in countries such as Nigeria, Cameroon, Chad, Eritrea, Ethiopia, Somalia, Mozambique, Malawi and Angola no longer exist. We found that the historic range of the West African subspecies (D. b. longipes), declared extinct in 2011, extends into southern Kenya, where a handful of individuals survive in the Masai Mara. We also identify conservation units that will help maintain evolutionary potential. Our results suggest a complete re-evaluation of current conservation management paradigms for the black rhinoceros.
The state of the Martian climate
The average annual surface air temperature (SAT) anomaly for 2016 for land stations north of 60°N was +2.0°C, relative to the 1981–2010 average value (Fig. 5.1). This marks a new high for the record starting in 1900, and is a significant increase over the previous highest value of +1.2°C, which was observed in 2007, 2011, and 2015. Average global annual temperatures also showed record values in 2015 and 2016. Currently, the Arctic is warming at more than twice the rate of lower latitudes.
Multimodal generation in the COMIC dialogue system
We describe how context-sensitive, user-tailored output is specified and produced in the COMIC multimodal dialogue system. At the conference, we will demonstrate the user-adapted features of the dialogue manager and text planner.